Abstract. We consider a dynamic vehicle routing problem with time windows
and stochastic customers (DS-VRPTW), in which customers may request services
after vehicles have already started their tours. To solve this problem, the goal
is to provide a decision rule for choosing, at each time step, the next action to
perform in light of known requests and probabilistic knowledge on request
likelihood. We introduce a new decision rule, called the Global Stochastic Assessment
(GSA) rule for the DS-VRPTW, and we compare it with existing decision rules,
such as MSA. In particular, we show that GSA fully integrates nonanticipativity
constraints so that it leads to better decisions in our stochastic context. We
describe a new heuristic approach for efficiently approximating our GSA rule. We
also introduce a new waiting strategy. Experiments on dynamic and stochastic
benchmarks, which include instances of different degrees of dynamism, show that
our approach is not only competitive with state-of-the-art methods, but also makes
it possible to compute meaningful offline solutions to fully dynamic problems where
absolutely no a priori customer request is provided.


1 Introduction 

This paper is an extended version of study [30], and contains additional detailed
experimental results.

Dynamic (or online) vehicle routing problems (D-VRPs) arise when information
about demands is incomplete, e.g., whenever a customer is able to submit a request
during the online execution of a solution. D-VRP instances usually indicate the
deterministic requests, i.e., those that are known before the online process, if any.
Whenever some additional (stochastic) knowledge about unknown requests is available,
the problem is said to be stochastic. We focus on the Dynamic and Stochastic VRP with
Time Windows (DS-VRPTW). These problems arise in many practical situations, such as
door-to-door or door-to-hospital transportation of elderly or disabled persons. In many
countries, authorities try to set up dial-a-ride services, but escalating operating costs
and the complexity of satisfying all customer demands rapidly become unmanageable for
solution methods based on human choices. Such complex dynamic problems therefore
need reliable and efficient algorithms that should first be assessed on reference
problems, such as the DS-VRPTW.


In this paper, we present a new heuristic method for solving the DS-VRPTW, based
on a stochastic programming model. By definition, our approach enables a higher
level of anticipation than heuristic state-of-the-art methods. The resulting new online
decision rule, called Global Stochastic Assessment (GSA), comes with a theoretical
analysis that clearly defines the nature of the method. We propose a new waiting strategy
together with a heuristic algorithm that embeds GSA. We compare GSA with the
state-of-the-art method MSA, and provide a comprehensive experimental study that
highlights the contributions of existing and new waiting and relocation strategies.

This paper is organized as follows. Section 2 describes the problem. Section 3
presents the state-of-the-art method we compare to and briefly discusses related work.
GSA is then presented in Section 4. Section 5 describes an implementation that
embeds GSA, based on heuristic local search. Finally, Section 6 summarizes the
experimental results. A conclusion follows in Section 7.

2 Description of the DS-VRPTW 

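Notations. We note $[l,u]$ the set of all integer values $i$ such that $l \le i \le u$.
A sequence $\langle x^i, x^{i+1}, \ldots, x^{i+k} \rangle$ (with $k \ge 0$) is noted
$x^{i..i+k}$, and the concatenation of two sequences $x^{i..j}$ and $x^{j+1..k}$
(with $i \le j < k$) is noted $x^{i..j}.x^{j+1..k}$. Random variables are noted
$\boldsymbol{\xi}$ and their realizations are noted $\xi$. We note
$\xi \in \boldsymbol{\xi}$ the fact that $\xi$ is a realization of $\boldsymbol{\xi}$,
and $p(\boldsymbol{\xi} = \xi)$ the probability that the random variable
$\boldsymbol{\xi}$ is realized to $\xi$. Finally, we note
$E_{\boldsymbol{\xi}}[f(\boldsymbol{\xi})]$ the expected value of
$f(\boldsymbol{\xi})$, which is defined by

$$E_{\boldsymbol{\xi}}[f(\boldsymbol{\xi})] = \sum_{\xi \in \boldsymbol{\xi}} p(\boldsymbol{\xi} = \xi) \cdot f(\xi).$$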

Input Data of a DS-VRPTW. We consider a discrete time horizon $[1,H]$ such that each
online event or decision occurs at a discrete time $t \in [1,H]$, whereas each offline
event or decision occurs at time $t = 0$. The DS-VRPTW is defined on a complete and
directed graph $G = (V,E)$. The set of vertices $V = [0,n]$ is composed of a depot
(vertex 0) and $n$ customer regions (vertices 1 to $n$). To each arc $(i,j) \in E$ is
associated a travel time $t_{i,j} \in \mathbb{R}_{\ge 0}$, that is, the time needed by a
vehicle to travel from $i$ to $j$, with $t_{i,j} \neq t_{j,i}$ in general. To each
customer region $i \in [1,n]$ is associated a load $q_i$, a service duration
$d_i \in [1,H]$ and a time window $[e_i, l_i]$ with $e_i, l_i \in [1,H]$ and
$e_i \le l_i$.

The set of all customer requests is $R \subseteq [1,n] \times [0,H]$. For each request
$(i,t) \in R$, $t$ is the time when the request is revealed. When $t = 0$, the request is
known before the online execution and it is said to be deterministic. When $t > 0$, the
request is revealed during the online execution at time $t$ and it is said to be online
(or dynamic). There may be several requests for the same vertex $i$, revealed at
different times. During the online execution, we only know a subset of the requests of
$R$ (i.e., those which have already been revealed). However, for each time
$t \in [1,H]$, we are provided a probability vector $P^t$ such that, for each vertex
$i \in [1,n]$, $P^t[i]$ is the probability that a request is revealed for $i$ at time $t$.

There are k vehicles and all vehicles have the same capacity Q. 

Solution of a DS-VRPTW. At the end of the time horizon, a solution is a subset of
requests $R^a \subseteq R$ together with $k$ routes (one for each vehicle). Requests in
$R^a$ are said to be accepted, whereas requests in $R \setminus R^a$ are said to be
rejected. The routes must satisfy the constraints of the classical VRPTW restricted to
the subset $R^a$ of accepted requests, i.e., each route must start at the depot at a time
$t \ge 1$ and end at the depot at a time $t' \le H$, and for each accepted request
$(i,t) \in R^a$, there must be exactly one route that arrives at vertex $i$ at a time
$t' \in [e_i, l_i]$ with a current load $l \le Q - q_i$ and leaves vertex $i$ at a time
$t'' \ge t' + d_i$. The goal is to minimize the number of rejected requests.

As not all requests are known at time 0, the solution cannot be computed offline, and
it is computed during the online execution. More precisely, at each time $t \in [1,H]$,
an action $a^t$ is computed. Each action $a^t$ is composed of two parts: first, for each
request $(i,t) \in R$ revealed at time $t$, the action $a^t$ must either accept the
request or reject it; second, for each vehicle, the action $a^t$ must give operational
decisions for this vehicle at time $t$ (i.e., service a request, travel towards a vertex,
or wait at its current position). Before the online execution (at time 0), some decisions
are computed offline. Therefore, we also have to compute a first action $a^0$.

A solution is a sequence of actions $a^{0..H}$ which covers the whole time horizon.
This sequence must satisfy VRPTW constraints, i.e., the actions of $a^{0..H}$ must define
$k$ routes such that each request accepted in $a^{0..H}$ is served once by one of these
routes within the time window associated with the served vertex and without violating
capacity constraints. We define the objective function $\omega$ such that
$\omega(a^{0..t})$ is $+\infty$ if $a^{0..t}$ does not satisfy VRPTW constraints, and
$\omega(a^{0..t})$ is the number of requests rejected in $a^{0..t}$ otherwise. Hence, a
solution is a sequence $a^{0..H}$ such that $\omega(a^{0..H})$ is minimal at the end of
the horizon.
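The constraints and the objective above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the data layout, and the simplification that an early vehicle waits at the customer until the window opens (so service starts at the later of the arrival time and the window opening) are assumptions made for illustration only.

```python
import math

def route_is_feasible(route, travel, demand, service, window, capacity, horizon):
    """Check one route (list of customer vertices, depot 0 implicit at both ends).

    Simplifying assumption: a vehicle that arrives early waits until the
    window opens, so service starts at max(arrival, e_i).
    """
    time, load, here = 1, 0, 0            # leave the depot at time 1, empty
    for v in route:
        time += travel[here][v]
        earliest, latest = window[v]
        time = max(time, earliest)        # wait for the window to open
        if time > latest:                 # too late for this customer
            return False
        load += demand[v]
        if load > capacity:               # capacity constraint violated
            return False
        time += service[v]
        here = v
    return time + travel[here][0] <= horizon   # back at the depot in time


def omega(routes, rejected, **instance):
    """Objective: +inf if any route is infeasible, else number of rejections."""
    if not all(route_is_feasible(r, **instance) for r in routes):
        return math.inf
    return len(rejected)
```

With symmetric travel times of 2 between three vertices, serving a customer with a tight window second makes the route infeasible, whereas serving it first yields a cost equal to the number of rejected requests.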


Stochastic program. There are different notations used for formulating stochastic
programs; we mainly use those from [8]. For each time $t \in [1,H]$, we have a vector of
random variables $\boldsymbol{\xi}^t$ such that, for each vertex $i \in [1,n]$,
$\boldsymbol{\xi}^t[i]$ is realized to 1 if a request for $i$ is revealed at time $t$, and
to 0 otherwise. The probability distribution of $\boldsymbol{\xi}^t$ is defined by $P^t$,
i.e., $p(\boldsymbol{\xi}^t[i] = 1) = P^t[i]$ and
$p(\boldsymbol{\xi}^t[i] = 0) = 1 - P^t[i]$. We note $\boldsymbol{\xi}^{1..t}$ the random
matrix composed of the random vectors $\boldsymbol{\xi}^1$ to $\boldsymbol{\xi}^t$. A
realization $\xi^{1..H} \in \boldsymbol{\xi}^{1..H}$ is called a scenario.
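As an illustration of this definition, the sketch below draws scenarios from the probability vectors. It assumes that requests are drawn independently across vertices and time units, which the text does not state explicitly; the function names and the list-of-requests representation are also illustrative choices.

```python
import random

def sample_scenario(P, t, rng):
    """Sample one realization of the requests revealed at times t+1..H.

    P[u][i] is the probability that a request for vertex i is revealed at
    time u (row 0 and column 0 are unused so that indices match the text).
    Returns the sampled requests as (i, u) pairs, ordered by reveal time.
    """
    H, n = len(P) - 1, len(P[1]) - 1
    return [(i, u)
            for u in range(t + 1, H + 1)
            for i in range(1, n + 1)
            if rng.random() < P[u][i]]   # independent Bernoulli draw (assumption)


def sample_pool(P, t, alpha, seed=0):
    """Build a pool of alpha scenarios with a reproducible generator."""
    rng = random.Random(seed)
    return [sample_scenario(P, t, rng) for _ in range(alpha)]
```

Representing a scenario by the list of its revealed requests, rather than by the full 0/1 matrix, keeps it compact when request probabilities are small.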

At each time $t \in [1,H]$, the action $a^t$ must contain one accept or reject for each
request which is revealed in $\xi^t$. Therefore, we note $A(\xi^t)$ the set of all actions
that contain an accept or a reject for each vertex $i \in [1,n]$ such that
$\xi^t[i] = 1$. Of course, these actions also contain other decisions related to the $k$
vehicles. We also note $A(\xi^{t_1..t_2})$ the sequence of sets
$\langle A(\xi^{t_1}), \ldots, A(\xi^{t_2}) \rangle$ where $t_1 \le t_2$.

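Hence, at each time $t$, given the sequence $a^{0..t-1}$ of past actions, the best action
$a^t$ is obtained by solving the multistage stochastic problem defined by eq. (1):

$$a^t = \operatorname*{argmin}_{a^t \in A(\xi^t)} E_{\boldsymbol{\xi}^{t+1}} \Big[ \min_{a^{t+1} \in A(\xi^{t+1})} E_{\boldsymbol{\xi}^{t+2}} \Big[ \cdots \min_{a^{H-1} \in A(\xi^{H-1})} E_{\boldsymbol{\xi}^{H}} \Big[ \min_{a^{H} \in A(\xi^{H})} \omega(a^{0..H}) \Big] \cdots \Big] \Big] \tag{1}$$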


Note that this multistage stochastic problem is different from the two-stage stochastic
problem defined by eq. (2):

$$a^t = \operatorname*{argmin}_{a^t \in A(\xi^t)} E_{\boldsymbol{\xi}^{t+1..H}} \Big[ \min_{a^{t+1..H} \in A(\xi^{t+1..H})} \omega(a^{0..H}) \Big] \tag{2}$$
At time t = 1, there is only one vehicle, which is on vertex a, and we have to choose
between two possible actions: travel to b or travel to c.

Fig. 1. A simple example of nonanticipation. The graph is displayed on the left. Time windows
are displayed in brackets. For every couple of vertices, if an arrow $i \to j$ is displayed then
$t_{i,j} = 2$; otherwise $t_{i,j} = 20$. To simplify, we consider only 2 equiprobable scenarios
(displayed on the right). These scenarios have the same prefix (at times 2 and 3 no request is
revealed) but reveal different requests at time 4. When using eq. (1) at time t = 1, we choose to
travel to c as the expected cost with nonanticipativity constraints is 1: at time 4, only one
scenario will remain and if this scenario is $\xi_1$ (resp. $\xi_2$), request (d, 4) (resp. (g, 4))
will be rejected. When using eq. (2), we choose to travel to b as the expected cost without
nonanticipativity constraints is 0 (for each possible scenario, there exists a sequence of actions
which serves all requests: travel to d, e, and f for $\xi_1$ and travel to g, h, and i for $\xi_2$).
However, if we travel to b, at time 3 we will have to choose between traveling to d or g and at
this time the expected cost of both actions will be 1.5: if we travel to d (resp. g), the cost with
scenario $\xi_1$ is 0 (resp. 3) and the cost with scenario $\xi_2$ is 3 (resp. 0). In this example,
the nonanticipativity constraints of the multistage problem (1) thus lead to a better action than
the two-stage relaxation (2).


Indeed, eq. (1) enforces nonanticipativity constraints so that, at each time $t' > t$, we
consider the action $a^{t'}$ which minimizes the expectation given only what has been
revealed up to time $t'$, without considering the possible realizations of
$\boldsymbol{\xi}^{t'+1..H}$. Eq. (2) does not enforce these constraints and considers the
best sequence $a^{t+1..H}$ for each realization
$\xi^{t+1..H} \in \boldsymbol{\xi}^{t+1..H}$. Therefore, eq. (1) may lead to a larger
expectation of $\omega$ than eq. (2), as it is more constrained. However, the expectation
computed in eq. (1) leads to better decisions in our context where some requests are not
revealed at time $t$. This is illustrated in Fig. 1.
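The difference between the two formulations can be checked numerically on the example of Fig. 1. The sketch below collapses the example to two decision stages and uses only the costs stated in the caption; the dictionary layout and the placeholder label "stay" for the single follow-up after traveling to c are illustrative assumptions.

```python
# Costs (numbers of rejected requests) taken from the Fig. 1 caption,
# indexed as cost[first_action][follow_up][scenario].
cost = {
    "b": {"d": {"xi1": 0, "xi2": 3}, "g": {"xi1": 3, "xi2": 0}},
    "c": {"stay": {"xi1": 1, "xi2": 1}},   # "stay" is a placeholder label
}
prob = {"xi1": 0.5, "xi2": 0.5}            # two equiprobable scenarios


def two_stage(first):
    # Two-stage view: the follow-up is chosen after the scenario is known.
    return sum(p * min(c[s] for c in cost[first].values())
               for s, p in prob.items())


def multistage(first):
    # Multistage view: the follow-up is chosen before the scenario is revealed.
    return min(sum(p * c[s] for s, p in prob.items())
               for c in cost[first].values())


best_two_stage = min(cost, key=two_stage)
best_multistage = min(cost, key=multistage)
```

The two-stage evaluation rates traveling to b at 0 and therefore prefers it, while the nonanticipative evaluation rates it at 1.5 and prefers traveling to c, whose cost is 1 under both views.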

3 Related Work 

The first D-VRP is proposed in [29], which introduces a single-vehicle Dynamic
Dial-a-Ride Problem (D-DARP) in which customer requests appear dynamically. Then, [20]
introduced the concept of immediate requests that must be serviced as soon as possible,
implying a replanning of the current vehicle route. Complete reviews on D-VRP
may be found in [21,18]. In this section, we focus more particularly on stochastic
D-VRPs. [18] classifies approaches for stochastic D-VRP in two categories, either based on
stochastic modeling or on sampling. Stochastic modeling approaches formally capture
the stochastic nature of the problem, so that solutions are computed in the light of an
overall stochastic context. Such holistic approaches usually require strong assumptions
and efficient computation of complex expected values. Sampling approaches try to capture
stochastic knowledge by sampling scenarios, so that they tend to be more focused
on local stochastic evidence. Their local decisions however allow sample-based methods
to scale up to larger problem instances, even under challenging timing constraints.
Algorithm 1: The ChooseRequest-E Expectation Algorithm

1 for $a^t \in A(\xi^t)$ do $f(a^t) \leftarrow 0$
2 Generate a set $S$ of $\alpha$ scenarios using Monte Carlo sampling
3 for each scenario $s \in S$ and each action $a^t \in A(\xi^t)$ do
4     $f(a^t) \leftarrow f(a^t)$ + cost of (approximate) solution to scenario $s$ starting with $a^t$
5 return $\operatorname{argmin}_{a^t \in A(\xi^t)} f(a^t)$


One usually needs to find a good compromise between a high number of scenarios,
which provides a better representation of the real distributions, and a more restricted
number, which requires less computational effort.

The Multiple Scenario Approach (MSA) was introduced for the DS-VRPTW.
A key element of MSA is an adaptive memory that stores a pool of solutions. Each
solution is computed by considering a particular scenario which is optimized for a few
seconds. The pool is continuously populated and filtered such that all solutions are
consistent with the current system state. Another important element of MSA is the ranking
function used to make operational decisions involving idle vehicles. Three algorithms
have been designed for that purpose:

- Expectation samples a set of scenarios and selects the next request to be serviced
by considering its average cost on the sampled set of scenarios. Algorithm 1
depicts how it chooses the next action $a^t$ to perform. It requires an optimization
for each action $a^t \in A(\xi^t)$ and each scenario $s \in S$ (lines 3-4), which is
computationally very expensive, even with a heuristic approach.

- Regret [3,6] approximates the expectation algorithm by recognizing that, given a
solution $sol^s$ to a particular scenario $s$, it is possible to compute a good
approximation of the local loss incurred by performing another action than the next
planned one in $sol^s$.

- Consensus [4,7] selects the request that appears the most frequently as the next
serviced request in the solution pool.
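The Expectation and Consensus rules described above can be sketched as follows. This is an illustrative outline only: `solve_cost` is a hypothetical stand-in for the (approximate) per-scenario optimization, and plans are assumed to be plain sequences of actions whose first element is the next action.

```python
from collections import Counter

def expectation_choice(actions, scenarios, solve_cost):
    """Expectation: pick the action with the lowest total cost over scenarios.

    solve_cost(action, scenario) is a hypothetical (approximate) solver that
    returns the cost of a solution to `scenario` starting with `action`.
    """
    totals = {a: sum(solve_cost(a, s) for s in scenarios) for a in actions}
    return min(totals, key=totals.get)


def consensus_choice(plans):
    """Consensus: pick the most frequent next action among the pooled plans.

    Each plan is a sequence of actions; only its first element is used.
    """
    votes = Counter(plan[0] for plan in plans)
    return votes.most_common(1)[0][0]
```

The sketch makes the cost gap visible: the Expectation rule calls the solver once per action and per scenario, whereas Consensus only counts actions in plans that already exist in the pool.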

Quite similar to the Consensus algorithm is the Dynamic Sample Scenario Hedging
Heuristic, introduced for the stochastic VRP. A Tabu Search heuristic has also been
designed for the DS-VRPTW, together with a vehicle-waiting strategy computed on a
future request probability threshold in the near region. Finally, MSA has been extended
with waiting and relocation strategies, so that vehicles are able to relocate to
promising vertices that have not been requested yet. As the performance of MSA has been
demonstrated in several studies [5,12,22,19], it is still considered a state-of-the-art
method for dealing with the DS-VRPTW.

Other studies of particular interest for our paper are [13], on the dynamic and
stochastic pickup and delivery problem, and [22], on the DS-DARP. Both consider local
search based algorithms. Instead of a solution pool, they exploit one single solution that
minimizes the expected cost over a set of scenarios. However, in order to limit
computational effort, only near-future requests are sampled within each scenario. Although
the approach of [22] is similar to the one of [13], the set of scenarios considered is
reduced to one scenario. Although these latter papers show some similarities with the
approach we propose, they do not provide any mathematical motivation or analysis of
their methods.


4 The Global Stochastic Assessment Decision Rule


The two-stage stochastic problem defined by eq. (2) may be solved by a sampling
method such as MSA, which solves a deterministic VRPTW for each possible scenario
(i.e., realization of the random variables) and selects the action $a^t$ which
minimizes the sum of all minimum objective function values weighted by scenario
probabilities. However, we have shown in Section 2 that eq. (2) does not enforce
nonanticipativity constraints because the different deterministic VRPTWs are solved
independently. To enforce nonanticipativity constraints while enabling sampling methods,
we push these constraints into the computation of the optimal solutions for all different
scenarios: instead of computing these different optimal solutions independently, we
propose to compute them all together so that we can ensure that whenever two scenarios
share the same prefix of realizations, the corresponding actions are forced to be equal.

At each time $t \in [0,H]$, let $r$ be the number of different possible realizations of
$\boldsymbol{\xi}^{t+1..H}$, and let us note $\xi_1^{t+1..H}, \ldots, \xi_r^{t+1..H}$
these realizations. Given the sequence $a^{0..t-1}$ of past actions, we choose action
$a^t$ by using eq. (3):

$$a^t = \operatorname*{argmin}_{a^t \in A(\xi^t)} Q(a^{0..t}, \{\xi_1^{t+1..H}, \ldots, \xi_r^{t+1..H}\}) \tag{3}$$

which is called the deterministic equivalent form of eq. (1).

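$Q(a^{0..t}, \{\xi_1^{t+1..H}, \ldots, \xi_r^{t+1..H}\})$ solves the deterministic
optimization problem

$$\min_{a_i^{t+1..H} \in A(\xi_i^{t+1..H}),\; i \in [1,r]} \; \sum_{i=1}^{r} p(\boldsymbol{\xi}^{t+1..H} = \xi_i^{t+1..H}) \cdot \omega(a^{0..t}.a_i^{t+1..H}) \tag{4}$$

$$\text{s.t.} \quad (\xi_i^{t+1..t'} = \xi_j^{t+1..t'}) \Rightarrow (a_i^{t+1..t'} = a_j^{t+1..t'}), \quad \forall t' \in [t+1,H], \; \forall i,j \in [1,r] \tag{5}$$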


The nonanticipativity constraints (5) state that, when 2 realizations $\xi_i^{t+1..H}$
and $\xi_j^{t+1..H}$ share the same prefix from $t+1$ to $t'$, the corresponding actions
must be equal [23].
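A small checker makes the meaning of these constraints concrete. The sketch below assumes a finite set of scenarios, each paired with its own sequence of actions, and that realizations and actions can be compared for equality; the function name and data layout are illustrative.

```python
def is_nonanticipative(scenarios, plans):
    """Check the nonanticipativity constraints on a finite set of scenarios.

    scenarios[i][k] is the realization at time t+1+k in scenario i, and
    plans[i][k] is the action taken at that time in the solution of scenario i.
    """
    r = len(scenarios)
    for i in range(r):
        for j in range(i + 1, r):
            horizon = min(len(scenarios[i]), len(scenarios[j]))
            for k in range(horizon):
                if scenarios[i][k] != scenarios[j][k]:
                    break                  # prefixes diverge from here on
                if plans[i][k] != plans[j][k]:
                    return False           # same prefix, different action
    return True
```

Solving each scenario independently, as in a two-stage relaxation, typically produces plans for which this check fails as soon as two scenarios share a prefix.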

Solving eq. (3) is computationally intractable for two reasons. First, since the
number $r$ of possible realizations of $\boldsymbol{\xi}^{t+1..H}$ is exponential in the
number of vertices and in the remaining horizon size $H - t$, considering every possible
scenario is intractable in practice. We therefore consider a smaller set of $\alpha$
scenarios $S = \{s_1, \ldots, s_\alpha\}$ such that each scenario $s_i \in S$ is a
realization of $\boldsymbol{\xi}^{t+1..H}$, i.e.,
$\forall i \in [1,\alpha], s_i \in \boldsymbol{\xi}^{t+1..H}$. This set $S$ is obtained by
Monte Carlo sampling [2]. All elements of $S$ share the same probability, i.e.,
$p(\boldsymbol{\xi}^{t+1..H} = s_1) = \ldots = p(\boldsymbol{\xi}^{t+1..H} = s_\alpha)$.

Second, solving eq. (3) basically involves solving problem $Q$ to optimality for each
possible action $a^t \in A(\xi^t)$. Each problem $Q$ involves solving a VRPTW for each
possible scenario of $S$, while ensuring nonanticipativity constraints between the
different solutions. As the VRPTW is an NP-hard problem, we propose to compute an
upper bound $\bar{Q}$ of $Q$ based on a given sequence $a_R^{t+1..H}$ of future route
actions. Because we impose the sequence $a_R^{t+1..H}$, the set of possible actions at
time $t$ is limited to those directly compatible with it, denoted
$A(\xi^t, a_R^{t+1..H}) \subseteq A(\xi^t)$. That limitation enforces
$\bar{Q} < +\infty$. This finally leads to the GSA decision rule:

$$(\text{GSA}) \quad a^t = \operatorname*{argmin}_{a^t \in A(\xi^t, a_R^{t+1..H})} \bar{Q}(a^{0..t}, a_R^{t+1..H}, S) \tag{6}$$

which, provided realization $\xi^t$, sampled scenarios $S$ and future route actions
$a_R^{t+1..H}$, selects the action $a^t$ that minimizes the expected approximate cost
over scenarios $S$. Notice that almost all the anticipative efficiency of the GSA decision
rule relies on the sequence $a_R^{t+1..H}$, which directly affects the quality of the
upper bound $\bar{Q}$.

Sequence $a_R^{t+1..H}$ of future route actions. This sequence is used to compute an
upper bound of $Q$. For each time $t' \in [t+1,H]$, the route action $a_R^{t'}$ only
contains operational decisions related to vehicle routing (i.e., for each vehicle, travel
towards a vertex, or wait at its current position) and does not contain decisions related
to requests (i.e., request acceptance or rejection). The more flexible $a_R^{t+1..H}$ is
with respect to $S$, the better the bound $\bar{Q}$. We describe in Section 5 how a
flexible sequence is computed through local search.

Computation of an upper bound $\bar{Q}$ of $Q$. Algorithm 2 depicts the computation of
an upper bound $\bar{Q}$ of $Q$ given a sequence $a_R^{t+1..H}$ of route actions
consistent with past actions $a^{0..t}$. For each scenario $s_i$ of $S$, Algorithm 2
builds a sequence $b^{0..H}$ for $s_i$, which starts with $a^{0..t}$, and whose end
$b^{t+1..H}$ is computed from $a_R^{t+1..H}$ in a greedy way. At each time
$t' \in [t+1,H]$, each request revealed at time $t'$ in scenario $s_i$ is accepted if it
is possible to modify $b^{t'..H}$ so that one vehicle can service it; it is rejected
otherwise. One can consider $b^{t'..H}$ as being a set of vehicle routes, each defined by
a sequence of planned vertices. Each planned vertex comes with specific decisions: a
waiting time and whether a service is performed. In this context, trytoServe performs
a deterministic linear-time modification of $b^{t'..H}$ such that
trytoServe($(j,t')$, $b^{t'..H}$) corresponds to the insertion of the vertex $j$ in one of
the routes defined by $b^{t'..H}$, at the best position with respect to VRPTW constraints
and travel times, without modifying the order of the remaining vertices. At the end,
Algorithm 2 returns the average number of rejected


Algorithm 2: The $\bar{Q}(a^{0..t}, a_R^{t+1..H}, S)$ approximation function

1 Precondition: $a_R^{t+1..H}$ is a sequence of route actions consistent with $a^{0..t}$
2 for each scenario $s_i \in S$ do
3     nbRejected[i] $\leftarrow$ 0 ; $b^{0..t} \leftarrow a^{0..t}$ ; $b^{t+1..H} \leftarrow a_R^{t+1..H}$
4     for $t' \in [t+1,H]$ do
5         for each request $(j,t')$ revealed at time $t'$ for a vertex $j$ in scenario $s_i$ do
6             $c^{t'..H} \leftarrow$ trytoServe($(j,t')$, $b^{t'..H}$)
7             if $c^{t'..H}$ is feasible then $b^{t'..H} \leftarrow c^{t'..H}$
8             else add the decision reject($j,t'$) to $b^{t'}$ and increment nbRejected[i]
9 return $\frac{1}{\alpha} \cdot \sum_{s_i \in S}$ nbRejected[i]
requests for all scenarios. Note that, when modifying a sequence of actions so that a
request can be accepted (line 6), actions $b^{t'..H}$ can be modified, but
$b^{0..t'-1}$ are not modified. This ensures that $\bar{Q}$ preserves the
nonanticipativity constraints. Indeed, the fact that two identical scenario prefixes
could be assigned two different subsequences of actions implies that either
trytoServe($(j,t')$, $b^{t'..H}$) is able to modify an action $b^{t''}$ with $t'' < t'$,
or is a nondeterministic function. In both cases, there is a contradiction. Finally,
notice that, contrary to other local search methods based on Monte Carlo simulation
(see Section 3), GSA considers the whole time horizon when evaluating a first-stage
solution against a scenario.
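The greedy evaluation of Algorithm 2 can be outlined in a few lines of Python. This is a sketch under simplifying assumptions, not the paper's implementation: routes are plain lists of vertices, time is not modeled, and `append_if_room` is a toy, hypothetical stand-in for trytoServe that ignores time windows and travel times.

```python
import copy

def q_upper_bound(routes, scenarios, try_to_serve):
    """Average number of rejected requests over the sampled scenarios.

    routes:       planned future routes (stands in for the route actions)
    scenarios:    list of scenarios, each a time-ordered list of (j, t') requests
    try_to_serve: deterministic function (request, plan) -> new plan, or None
                  when the request cannot be inserted
    """
    rejected = 0
    for scenario in scenarios:
        plan = copy.deepcopy(routes)       # every scenario starts from the same plan
        for request in scenario:           # requests in order of reveal time
            candidate = try_to_serve(request, plan)
            if candidate is None:
                rejected += 1              # no feasible insertion: reject
            else:
                plan = candidate           # accept and keep the modified plan
    return rejected / len(scenarios)


def append_if_room(request, plan, max_stops=2):
    """Toy stand-in: append to the first route with fewer than max_stops stops."""
    for k, route in enumerate(plan):
        if len(route) < max_stops:
            new_plan = copy.deepcopy(plan)
            new_plan[k].append(request[0])
            return new_plan
    return None
```

Because every scenario starts from the same plan and the insertion function is deterministic, two scenarios with an identical prefix of requests go through identical plan modifications, which is the property the nonanticipativity argument relies on.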


Comparison to MSA. GSA has two major differences with MSA. Given a set of scenarios,
GSA maintains only one solution, namely the sequence $a_R^{t+1..H}$, that best suits a
pool of scenarios, whilst MSA computes a set of solutions, each specialized to one
scenario from the pool. Furthermore, by preserving nonanticipativity, GSA approximates
the multistage problem of eq. (1). In contrast, MSA relaxes these constraints
and therefore approximates the two-stage problem of eq. (2).

In particular, given a pool of scenarios obtained by Monte Carlo sampling, the MSA
Expectation algorithm (Algorithm 1) reformulates eq. (2) as a sample average
approximation (SAA, [11,28]) problem. The SAA tackles each scenario as a separate
deterministic problem. For a specific scenario $\xi_i^{t+1..H}$, it considers the
recourse cost of a solution starting with actions $a^{0..t}$. Because the scenarios are
not linked by nonanticipativity constraints, two scenarios $\xi_i^{t+1..H}$ and
$\xi_j^{t+1..H}$ that share the same prefix $\xi^{t+1..t'}$ can actually be assigned two
solutions performing completely different actions $a_i^{t+1..t'}$ and $a_j^{t+1..t'}$,
for some $t' > t$. The evaluation of action $a^t$ over the set of scenarios is therefore
too optimistic, leading to a suboptimal choice. By definition, the Regret algorithm
approximates the Expectation algorithm, and hence also approximates a two-stage problem.
The Consensus algorithm selects the most suggested action among the plans of the pool. By
selecting the most frequent action in the pool, Consensus somehow encourages
nonanticipation. However, the nonanticipativity constraints are not enforced, as each
scenario is solved separately. Consensus therefore also approximates a two-stage problem.


5 Solving the Dynamic and Stochastic VRPTW 


GSA alone is not sufficient to solve a DS-VRPTW instance. In this section, we
show how the decision rule, as defined in eq. (6), can be embedded in an online algorithm
that solves the DS-VRPTW. Finally, we present the different waiting and relocation
strategies we exploit, including a new waiting strategy.


5.1 Embedding GSA 

In order to solve the DS-VRPTW, we design Algorithm 3, which embeds the GSA
decision rule.
Algorithm 3: LS-based GSA

1 Initialize $S$ with $\alpha$ scenarios and compute initial solution $a_R^{1..H}$ w.r.t. known requests
2 $t \leftarrow 1$
3 while real time has not reached the end of the time horizon do
      /* Beginning of the time unit */
4     $(a^t, a_R^{t+1..H}) \leftarrow$ handleRequests($a^{0..t-1}$, $a_R^{t..H}$, $\xi^t$)
5     execute action $a^t$ and update the pool $S$ of scenarios w.r.t. $\xi^t$
      /* Remaining of the time unit */
6     while real time has not reached the end of time unit $t$ do
7         $b_R^{t+1..H} \leftarrow$ shakeSolution($a_R^{t+1..H}$)
8         if $\bar{Q}(a^{0..t}, b_R^{t+1..H}, S) < \bar{Q}(a^{0..t}, a_R^{t+1..H}, S)$ then $a_R^{t+1..H} \leftarrow b_R^{t+1..H}$
9         if the number of iterations since the last re-initialization of $S$ is equal to $\beta$ then
10            Re-initialize the pool $S$ of scenarios w.r.t. $\boldsymbol{\xi}^{t+1..H}$
11    $t \leftarrow t + 1$ /* Skip to next time unit */

12 Function handleRequests($a^{0..t-1}$, $a_R^{t..H}$, $\xi^t$)
13    $b^{0..t-1} \leftarrow a^{0..t-1}$ ; $b^{t..H} \leftarrow a_R^{t..H}$
14    for each request revealed for a vertex $j$ in realization $\xi^t$ do
15        if we find, in less than $\delta_{ins}$, how to modify $b^{t..H}$ s.t. request $(j,t)$ is served then
16            modify $b^{t..H}$ to accept request $(j,t)$
17        else
18            modify $b^{t..H}$ to reject request $(j,t)$
19    return $(b^t, b^{t+1..H})$


Main Algorithm. It is parameterized by: $\alpha$, which determines the size of the pool
$S$ of scenarios; $\beta$, which determines the frequency for re-initializing $S$; and
$\delta_{ins}$, which limits the time spent trying to insert a request in a sequence.

It runs in real time. It is started before the beginning of the time horizon, in order
to compute an initial pool $S$ of $\alpha$ scenarios and an initial solution
$a_R^{1..H}$ with respect to offline requests (revealed at time 0). It runs during the
whole time horizon, and loops on lines 3 to 11. It is stopped when reaching the end of the
time horizon. Real time is discretized in $H$ time units, and the variable $t$ represents
the current time unit: it is incremented when real time exceeds the end of the $t$-th time
unit. In order to be correct, Algorithm 3 requires the real computation time of lines 4
to 11 to be smaller than the real time spent in one time unit. This is achieved by
choosing suitable values for parameters $\alpha$ and $\delta_{ins}$.

Lines 4 and 5 describe what happens whenever the algorithm enters a new time unit:
function handleRequests (described below) chooses the next action $a^t$ and updates
$a_R^{t+1..H}$; finally, $S$ is updated such that it stays coherent with respect to
realization $\xi^t$. Each scenario $\xi^{t+1..H} \in S$ is composed of a sequence of
sampled requests. To each customer region $i$ is associated an upper bound
$r_i = \min(l_0 - t_{i,0} - d_i, \; l_i - t_{0,i})$ on the time unit at which a request
can be revealed in that region, like in [7]. That constraint prevents tricky or
unserviceable requests from being sampled. At time $t$, a sampled request $(i,t)$ which
does not appear in $\xi^t$ is either removed if $t \ge r_i$, or randomly delayed in
$\xi^{t+1..H} \in S$ otherwise.
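The pool update described above can be sketched as follows. Several points are assumptions made for illustration: the bound is read as the minimum of (depot closing time minus travel time back to the depot minus service duration) and (window closing time minus travel time from the depot); a delayed request gets a new reveal time drawn uniformly among the remaining admissible time units; and sampled requests that match a real request at the current time are simply dropped from the scenario.

```python
import random

def latest_reveal_time(l_depot, t_to_depot, service, l_i, t_from_depot):
    """Upper bound on the reveal time of a request in region i, read here as
    min(l_0 - t_{i,0} - d_i, l_i - t_{0,i})."""
    return min(l_depot - t_to_depot - service, l_i - t_from_depot)


def update_scenario(scenario, t, revealed, bound, rng):
    """Keep a sampled scenario consistent with what happened at time t.

    scenario: list of sampled requests (i, u); revealed: set of vertices with
    a real request at time t; bound[i]: latest reveal time for region i.
    """
    updated = []
    for i, u in scenario:
        if u != t:
            updated.append((i, u))         # still in the future, keep as is
        elif i in revealed:
            continue                       # became a real request (assumption: drop)
        elif t < bound[i]:
            # did not occur now: delay it to a later admissible time unit
            updated.append((i, rng.randint(t + 1, bound[i])))
        # otherwise the request can no longer occur and is removed
    return sorted(updated, key=lambda req: req[1])
```

Passing an explicit `random.Random` instance keeps the update reproducible, which helps when comparing runs of the online algorithm.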


The algorithm spends the rest of the time unit iterating over lines 7 to 10, in order to
improve the sequence of future route actions $a_R^{t+1..H}$. We consider a hill climbing
strategy: the current solution $a_R^{t+1..H}$ is shaken to obtain a new candidate
solution $b_R^{t+1..H}$, and if this solution leads to a better upper bound $\bar{Q}$ of
$Q$, then it becomes the new current solution. Shaking is performed by the shakeSolution
function. This function considers different neighborhoods, corresponding to the following
move operators: relocate, swap, inverted 2-opt, and cross-exchange (see [16,26] for
complete descriptions). As explained in Section 5.2, depending on the chosen waiting and
relocation strategy, additional move operators are exploited. At each call to the
shakeSolution function, the considered move operator is changed, such that the operators
are equally selected one after another in the list. Every $\beta$ iterations, the pool $S$
of scenarios is re-sampled (lines 9-10). This re-sampling introduces diversification as
the upper bound computed by $\bar{Q}$ changes. We therefore do not need any other
meta-heuristic such as Simulated Annealing.
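The inner loop can be summarized by the following sketch. It is a generic outline rather than the paper's code: `evaluate`, `operators` and `resample` are hypothetical callables supplied by the caller, and a fixed iteration budget stands in for the real-time limit of a time unit.

```python
def improve(solution, pool, evaluate, operators, resample, beta, iterations):
    """Hill climbing on the upper bound with periodic scenario re-sampling.

    evaluate(solution, pool) -> float   (plays the role of the upper bound)
    operators: list of functions solution -> neighbour, used in round-robin
    resample() -> new pool of scenarios
    A fixed iteration budget stands in for the real-time limit of a time unit.
    """
    for it in range(iterations):
        shake = operators[it % len(operators)]     # rotate through the operators
        candidate = shake(solution)
        if evaluate(candidate, pool) < evaluate(solution, pool):
            solution = candidate                   # accept strict improvements only
        if (it + 1) % beta == 0:
            pool = resample()                      # diversification
    return solution, pool
```

Since the current solution is re-evaluated against the current pool at every comparison, a solution accepted under an old pool can be displaced after re-sampling, which is where the diversification comes from.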


Function handleRequests is called at the beginning of a new time unit $t$, to compute
action $a^t$ in light of online requests (if any). It implements the GSA decision rule
defined in eq. (6). The function considers each request revealed at time $t$ for a vertex
$j$, in a sequential way. For each request, it tries to insert it into the sequence
$a_R^{t..H}$ (i.e., modify the routes so that a vehicle visits $j$ during its time
window). As in shakeSolution, local search operations are performed during that
computation. The time spent to find a feasible solution including the new request is
limited to $\delta_{ins}$. If such a feasible solution is found, then the request is
accepted; otherwise it is rejected. If there are several online requests for the same
discretized time $t$, we process these requests in their real-time order of arrival, and
we assume that all requests are revealed at different real times.


5.2 Waiting and Relocation strategies 

As defined in Section 2, a vehicle that has just visited a vertex usually has the choice
between traveling right away to the next planned vertex or first waiting for some time at
its current position. Unlike in the static (and deterministic) case, in the dynamic (and
stochastic) VRPTW these choices may have a significant impact on the solution quality.

Waiting and relocation strategies have attracted great interest in dynamic and
stochastic VRPs. In this section, we describe how waiting and relocation strategies are
integrated into our framework, including a new waiting strategy called
relocation-only.

Relocation strategies. Studies in [8,9] already showed that for a dynamic VRP with no
stochastic information, it is optimal to relocate the vehicle(s) either to the center (in
the single-vehicle case) or to strategic points (in the multiple-vehicle case) of the
service region. The idea evolved and has been successfully adapted to routing problems
with stochastic customer information, in reoptimization approaches as well as sampling
approaches.

Relocation strategies explore solutions obtained when allowing a vehicle to move
towards a customer vertex even if there is no request received for that vertex at the
current time slice. Doing so, one recognizes the fact that, in the context of dynamic and
stochastic vehicle routing, a higher level of anticipation can be obtained by considering
repositioning the vehicle, after it has serviced a request, to a more stochastically
fruitful location. Such a relocation strategy has already been applied to the DS-VRPTW
in [5].


Waiting strategies. In a dynamic context, the planning of a vehicle usually contains
more time than needed for traveling and servicing requests. When it finishes servicing
a request, a vehicle has the choice between waiting for some time at its location or
leaving for the next planned vertex. A good strategy for deciding where and how long to
wait can potentially help anticipate future requests and hence increase the dynamic
performance. We consider three existing waiting strategies and introduce a new one:

- Drive-First (DF): The basic strategy aims at leaving each serviced request as soon
as possible, possibly waiting at the next vertex before servicing it if the vehicle
arrives before its time window opens.

- Wait-First (WF): Another classical waiting strategy consists in delaying as much
as possible the service time of every planned request, without violating its time
window. After having serviced a request, the vehicle hence waits as long as possible
before moving to the next planned request.

- Custom-Wait (CW): A more tailored waiting strategy aims at controlling the waiting
time at each vertex, which becomes part of the online decisions.

- Relocation-Only waiting (RO): In order to take maximum benefit of the relocation
strategy while avoiding the computational overhead due to the additional decision
variables involved in custom waiting, we introduce a new waiting strategy. It basically
applies drive-first scheduling to every request and then applies wait-first waiting
only to those requests that follow a relocation one. By doing so, a vehicle will try
to arrive as soon as possible at a planned relocation request, and wait there as long
as possible. By contrast, it will spend as little time as possible at non-relocation
request vertices. Note that if it is not coupled with a relocation strategy, RO reduces
to DF. Furthermore, RO also reduces to the dynamic waiting strategy described
in 02 if we define the service zones as being delimited by relocation requests.
However, our strategy differs in that service zones in our approach are
computed in light of stochastic information instead of geometrical considerations.
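The difference between DF, WF and RO can be illustrated by a minimal scheduling sketch for a single route. It is only an illustration under stated assumptions, not the implementation used in the experiments: the `Stop` record, its fields and the `schedule` function are hypothetical names, a relocation is modeled as a stop with zero service time and a time window covering the whole horizon, and CW is left out since its waiting times are decision variables.

```python
from dataclasses import dataclass

@dataclass
class Stop:
    travel: float             # travel time from the previous stop (or the depot)
    earliest: float           # opening of the time window
    latest: float             # latest allowed start of service
    service: float = 0.0      # service duration
    relocation: bool = False  # True for a relocation action (assumed representation)

def latest_starts(route):
    """Backward pass: latest service start at each stop keeping later stops feasible."""
    result = [0.0] * len(route)
    bound = float("inf")  # latest time at which the vehicle may leave stop i
    for i in range(len(route) - 1, -1, -1):
        result[i] = min(route[i].latest, bound - route[i].service)
        bound = result[i] - route[i].travel
    return result

def schedule(route, strategy, start_time=0.0):
    """Return (departure from previous location, arrival, service start) per stop."""
    if strategy not in ("DF", "WF", "RO"):
        raise ValueError("unknown waiting strategy")
    plan = []
    clock = start_time        # time at which the vehicle is free at its current location
    after_relocation = False
    for stop, latest_start in zip(route, latest_starts(route)):
        asap_start = max(clock + stop.travel, stop.earliest)
        if asap_start > latest_start:
            raise ValueError("route violates a time window")
        # WF delays every stop; RO delays only the stop that follows a relocation.
        delayed = strategy == "WF" or (strategy == "RO" and after_relocation)
        if delayed:
            start = latest_start
            departure = start - stop.travel  # wait at the previous location
        else:
            start = asap_start
            departure = clock                # leave at once, wait on arrival if early
        plan.append((departure, departure + stop.travel, start))
        clock = start + stop.service
        after_relocation = stop.relocation
    return plan

# Toy route (made-up data): customer, relocation vertex, customer.
route = [Stop(10, 0, 100, 5), Stop(10, 0, 480, 0, True), Stop(10, 50, 200, 5)]
for name in ("DF", "WF", "RO"):
    print(name, schedule(route, name))
```

On this toy route, RO reaches the relocation vertex at the same time as DF (time 25) but leaves it only at time 190, so that all the slack of the route is spent at the relocation vertex. On a route without relocation stops, `schedule(route, "RO")` returns the same plan as `schedule(route, "DF")`.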

Depending on the waiting strategy we apply and whether we use relocation or not,
additional LS move operators are exploited. Specifically, among the waiting strategies, only
custom-wait requires additional move operators, which aim at either increasing or
decreasing the waiting time at a random planned vertex. Relocation also requires two additional
move operators that modify a given solution by either inserting or removing a relocation
action at a random vertex.
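As an illustration of these additional move operators, the sketch below applies them to a single route stored as a list of actions. The `Action` record and the functions `insert_relocation`, `remove_relocation` and `change_wait` are hypothetical names chosen for this example; the way the random position and the amount of waiting are drawn is an assumption, and feasibility checks are omitted.

```python
import random
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Action:
    vertex: int               # vertex visited by the vehicle
    relocation: bool = False  # True if the visit is a relocation, not a service
    wait: int = 0             # custom waiting time at the vertex (CW only)

def insert_relocation(route, vertices, rng):
    """Insert a relocation to a random vertex at a random position of the route."""
    new_route = list(route)
    position = rng.randrange(len(new_route) + 1)
    new_route.insert(position, Action(rng.choice(vertices), relocation=True))
    return new_route

def remove_relocation(route, rng):
    """Remove one randomly chosen relocation action, if the route contains any."""
    candidates = [i for i, action in enumerate(route) if action.relocation]
    new_route = list(route)
    if candidates:
        del new_route[rng.choice(candidates)]
    return new_route

def change_wait(route, delta, rng):
    """Increase (delta > 0) or decrease (delta < 0) the waiting time at a random vertex."""
    new_route = list(route)
    if new_route:
        i = rng.randrange(len(new_route))
        new_route[i] = replace(new_route[i], wait=max(0, new_route[i].wait + delta))
    return new_route

rng = random.Random(0)
route = [Action(3), Action(7)]
with_relocation = insert_relocation(route, [1, 2, 3], rng)
print(with_relocation)
print(remove_relocation(with_relocation, rng))
```

Each operator returns a new route and leaves its argument untouched, so a local search can evaluate the neighbor and discard it cheaply if it is rejected.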


6 Experiments

We now describe our experiments and compare our results with those of the
state-of-the-art MSA algorithm of Q.

6.1 Algorithms

Different versions of Algorithm [3] have been experimentally assessed, depending on
which waiting strategy is implemented and whether the relocation strategy is used in
addition.

Surprisingly, the wait-first waiting strategy, as well as its version including
relocation, produced very poor results in comparison to other strategies, rejecting more
than twice as many online requests on average. Because of its computational overhead, the
custom-wait strategy also produced poor results, even with relocation. For conciseness,
we therefore do not report these strategies in the result plots.

The 3 different versions of Algorithm [3] we thus consider are the following: GSA df,
which stands for GSA with the drive-first waiting strategy; GSA dfr, which stands for GSA
with drive-first and relocation strategies; and finally GSAro, which means GSA using the
relocation-only strategy. Recall that, by definition, the relocation-only strategy involves
relocation. In addition to those 3 algorithms, as a baseline we consider the GLS df
algorithm, which stands for greedy local search with drive-first waiting. This algorithm is
similar to the dynamic LS described in [22], to which we coupled a Simulated Annealing
metaheuristic. In this algorithm, stochastic information about future requests is not
taken into account and a neighboring solution is solely evaluated by its total travel cost.

Finally, GSA and GLS are compared to two MSA algorithms, namely MSA d and
MSA c, depending on whether the travel distance or the consensus function is used as
the ranking function.

6.2 Benchmarks 

The selected benchmarks are borrowed from 0 which considers a set of benchmarks
initially designed for the static and deterministic VRPTW in [25], each of these
containing 100 customers. In our stochastic and dynamic context, each customer becomes
a request region, where dynamic requests can occur during the online execution.

The original problems from 0 are divided into 4 classes of 15 instances. Each class
is characterized by its degree of dynamism (DOD, the ratio of the number of dynamic
requests revealed at time t > 0 over the number of a priori requests known at time t = 0)
and whether the dynamic requests are known early or late during the online execution.
The time horizon H = 480 is divided into 3 time slices. A request is said to be early
if it is revealed during the first time slice t ∈ [1, 160]. A late request is revealed during
the second time slice t ∈ [161, 320]. No request is revealed during the third time
slice t ∈ [321, 480], but the vehicles can use it to perform customer operations.

In Class 1 there are many initial requests, many early requests and very few late 
requests. Class 2 instances have many initial requests, very few early requests and some 
late requests. Class 3 is a mix of classes 1 and 2. In Class 4, there are few initial requests, 
few early requests and many late requests. Finally, classes 1, 2 and 3 have an average 
DOD of 44%, whilst Class 4 has an average DOD of 57%. 

In 0, a fifth class is proposed with a higher DOD of 81% on average. Unfortunately,
we were not able to get those Class 5 instances. We complete these classes by providing
a sixth class of instances, with a DOD of 100%. Each instance hence contains no initial
request, an early request with probability 0.3 and a late request with probability 0.7.
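To make the Class 6 description concrete, the following sketch draws one random scenario of request reveal times. It assumes, which the text does not state, that the probabilities apply independently to each request region and to each time slice, and that the reveal time is uniformly distributed within a slice. The names `SLICES` and `sample_scenario` are hypothetical.

```python
import random

# Time slices of the horizon H = 480 with the Class 6 reveal probabilities.
SLICES = [((1, 160), 0.3), ((161, 320), 0.7), ((321, 480), 0.0)]

def sample_scenario(n_regions, rng, slices=SLICES):
    """Return a list of (reveal_time, region) pairs sorted by reveal time."""
    requests = []
    for region in range(n_regions):
        for (start, end), probability in slices:
            # Assumption: one independent Bernoulli draw per region and per slice.
            if rng.random() < probability:
                # Assumption: reveal time uniform over the slice (bounds included).
                requests.append((rng.randint(start, end), region))
    return sorted(requests)

scenario = sample_scenario(100, random.Random(42))
early = sum(1 for t, _ in scenario if t <= 160)
print(f"{len(scenario)} requests, {early} early, {len(scenario) - early} late")
```

A pool of such scenarios is what a sampling-based evaluation would average over; no request ever falls in the third slice since its probability is 0.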

Figure [2] summarizes the different instance classes.

[Figure 2: table of instance classes, not recoverable; surviving entries: P^[161,320][j] = 0.7, P^[321,480][j] = 0]

Fig. 2. Summary of the test instances, grouped per degree of dynamism. P^[t,t'][j] represents the
probability that a request gets revealed during the time slice defined by interval [t, t'].


6.3 Results 


Computations are performed on a cluster composed of 32 64-bit AMD Opteron(tm)
Processor 6284 SE cores, with CPU frequencies ranging from 1400 to 2600 MHz.
Executables were developed in C++ and compiled on a Linux Red Hat environment
with GCC 4.4.7. Average results over 10 runs are reported. In [7], 25 minutes of
offline computation are allocated to MSA, in order to decide the first online action at time
t = 1. During online execution, each time unit within the time horizon lasted
7.5 seconds in the simulation framework. In order to compensate for the technology
difference, we decided in this study to allow only 10 minutes of offline computation
and 4 seconds of online computation per time unit. Thereafter, in order to highlight the
contribution of the offline computation in our approach, the amount of time allowed
for pre-computation is increased to 60 minutes, while each time unit still lasts 4
seconds. According to preliminary experiments, both the size of the scenario pool and the
resampling rate are set to α = β = 150 for all our algorithms except GLS df.

Figure [3] gives a graphical representation of the results of our algorithms, through
performance profiles. Performance profiles provide, for each algorithm, a cumulative
distribution of its performance compared to other algorithms. For a given algorithm, a point
(x, y) on its curve means that, in (100 · y)% of the instances, this algorithm performed at
most x times worse than the best algorithm on each instance taken separately. Instances
are grouped by DOD and by offline computation time. Classes 1, 2 and 3 have a DOD
of 44%, hence they are grouped together. An algorithm is strictly better than another
one if its curve stays above the other algorithm's curve. For example, on the 60min plot
of Class 6, GLS df is the worst algorithm in 95% of Class 6 instances, outperforming
GSA df in the remaining 5% (but not the other algorithms). On the other hand,
provided these 60 minutes of offline computation, GSAro obtains the best results in 55%
of the instances, compared with only 30% for GSA df and GSA dfr. See fill for a complete
description of performance profiles. Detailed results are provided in the appendix.
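The construction of a performance profile can be sketched as follows. This is a minimal illustration on made-up numbers, not the script used to produce the plots; the function names are hypothetical, lower values are assumed to be better, and the handling of instances whose best value is 0 (ratio 1 for algorithms that also reach 0, infinity otherwise) is an assumption of this sketch.

```python
import math

def performance_ratios(results):
    """results maps an algorithm name to its list of per-instance values (lower is better)."""
    algorithms = list(results)
    n_instances = len(results[algorithms[0]])
    ratios = {a: [] for a in algorithms}
    for i in range(n_instances):
        best = min(results[a][i] for a in algorithms)
        for a in algorithms:
            value = results[a][i]
            if value == best:
                ratio = 1.0          # the algorithm is (one of) the best on this instance
            elif best == 0:
                ratio = math.inf     # assumption: no finite ratio against a zero best value
            else:
                ratio = value / best
            ratios[a].append(ratio)
    return ratios

def profile(ratios, x):
    """Fraction y of instances on which the algorithm is at most x times worse than the best."""
    return sum(r <= x for r in ratios) / len(ratios)

# Made-up values for two algorithms on four instances.
results = {"A": [2.0, 4.0, 3.0, 1.0], "B": [4.0, 4.0, 1.5, 3.0]}
ratios = performance_ratios(results)
for name in results:
    print(name, [profile(ratios[name], x) for x in (1, 2, 3)])
```

Here `profile(ratios["A"], 1)` is 0.75, meaning that A is the best algorithm on 75% of the instances, and the curve of A stays above that of B for every x, which is the situation described above as A being strictly better than B.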

Our algorithms compare fairly with MSA, especially on the lately dynamic instances
of Class 4. Given more offline computation, our algorithms get stronger, although
MSA benefits from the same offline time in every plot. Surprisingly, GLS df performs
well compared to other algorithms on classes 1, 2 and 3. The low DOD that
characterizes these instances tends to lower the contribution of stochastic knowledge against
the computational power of GLS df. Indeed, approximating the stochastic evaluation
function over 150 scenarios is about 10^3 times more expensive than the GLS df evaluation
function. However, as the offline computation time and the DOD increase, stochastic
algorithms tend to outperform their deterministic counterpart.

Fig. 3. Performance profiles on classes [1, 2, 3], Class 4 and Class 6 problem instances

We notice that the relocation strategy gets stronger as the offline computation time
increases. This is due to the computational overhead induced by relocation vertices.
GSA df is then a good choice under limited offline computation time. However, both
GSAro and GSA dfr tend to outperform the other strategies when provided with enough
offline computation and a high DOD.

As it contains no deterministic request, in Class 6 the offline computation is not
applicable to those algorithms that do not exploit the relocation strategy, i.e. GLS df and
GSA df. Class 6 shows that, despite the huge difference in the number of iterations
performed by GLS df on one hand and stochastic algorithms on the other, the latter clearly
outperform GLS df on fully dynamic instances. We also notice in this highly dynamic
context that GSAro tends to outperform GSA dfr as offline computation increases,
highlighting the anticipative contribution provided by the relocation-only strategy, which
centers waiting times on relocation vertices.


7 Conclusions 


We proposed GSA, a decision rule for dynamic and stochastic vehicle routing with time
windows (DS-VRPTW), based on a stochastic programming heuristic approach.
Existing related studies, such as MSA, simplify the problem as a two-stage problem by using
sample average approximation. By contrast, the theoretical particularity of our method is
to approximate a multistage stochastic problem through Monte Carlo sampling, using a
heuristic evaluation function that preserves the nonanticipativity constraints. By
maintaining one unique anticipative solution designed to be as flexible as possible according
to a set of scenarios, our method differs in practice from MSA, which computes as many
solutions as scenarios, each being specialized for its associated scenario. Experimental
results show that GSA produces competitive results with respect to the state of the art. This
paper also proposes a new waiting strategy, relocation-only, aiming at taking full benefit
of the relocation strategy.

In a future study we plan to address a limitation of our solving algorithm which
embeds GSA, namely the computational cost of its evaluation function. One possible
direction would be to take more benefit of each evaluation, by spending much more
computational effort in constructing neighboring solutions, e.g. by using Large
Neighborhood Search m. Minimizing the operational cost, such as the total travel distance,
is usually also important in stochastic VRPs. Studying the effect of incorporating
it as a second objective would be worthwhile. It is also necessary to consider other
types of DS-VRPTW instances, such as problem sets closer to public or goods
transportation. Finally, the conclusions we made in section 2 about the shortcoming of a
two-stage formulation (shown in Fig. 1) are theoretical only, and should be
experimentally assessed.

Appendix: Detailed experimental results

The present appendix provides detailed experimental results on the different instance 
classes. 

Tables [1], [2], [3] and [4] show detailed results for each algorithm on classes 1-4, with
both 10 and 60 minutes allowed for offline computation. Note that both MSA d and
MSA c should only be compared with our algorithms when allowing 10 minutes of
offline computation. Table [5] shows the results obtained on Class 6 problem instances.
Since the instances belonging to Class 6 contain no deterministic request, offline
computation can only be performed on these instances when allowing relocation.

Class 1      10 min. offline computation           60 min. offline computation
             MSA   MSA   GLS   GSA   GSA   GSA     GLS   GSA   GSA   GSA
Instance       d     c    df    df   dfr    ro      df    df   dfr    ro
RC 101-1     1.0   0.6   0.5   2.6   1.1   3.3     1.2   1.1   2.9   1.2
RC 101-2     1.8   2.6   1.7   2.1   2.8   2.6     1.3   1.6   1.4   1.6
RC 101-3     1.8   1.0   1.7   2.8   3.0   3.3     1.8   2.0   2.3   1.6
RC 101-4     0.0   0.2   0.5   0.8   1.7   2.7     0.4   1.5   1.1   1.1
RC 101-5     0.4   1.0   0.7   2.5   2.5   2.7     0.8   0.9   1.5   2.2
Avg          1.0   1.1   1.0   2.2   2.2   2.9     1.1   1.4   1.8   1.5
RC 102-1     2.0   2.4   0.8   2.0   2.6   2.9     1.1   2.5   2.0   1.5
RC 102-2     0.4   0.8   1.0   0.8   1.4   1.9     1.9   0.8   1.4   0.8
RC 102-3     1.0   0.8   0.8   0.7   1.7   1.2     0.5   0.3   0.4   0.8
RC 102-4     1.2   1.4   1.6   0.5   0.4   0.8     1.7   0.5   0.4   0.5
RC 102-5     1.2   0.6   0.4   0.6   0.3   0.5     0.6   0.4   0.0   0.1
Avg          1.2   1.2   0.9   0.9   1.3   1.5     1.2   0.9   0.8   0.7
RC 104-1     0.0   0.2   0.1   0.0   0.7   1.1     0.1   0.1   0.1   0.4
RC 104-2     0.0   0.0   0.0   0.0   0.0   0.3     0.2   0.0   0.1   0.0
RC 104-3     0.0   0.0   0.0   0.0   0.0   0.0     0.0   0.0   0.0   0.0
RC 104-4     0.0   0.2   0.0   0.0   0.1   0.1     0.1   0.1   0.1   0.2
RC 104-5     0.0   0.0   0.0   0.0   0.0   0.0     0.0   0.0   0.0   0.0
Avg          0.0   0.0   0.0   0.0   0.2   0.3     0.1   0.0   0.1   0.1
AVG          0.7   0.8   0.6   1.0   1.2   1.6     0.8   0.8   0.9   0.8

Table 1. Detailed results for Class 1 problem instances. Averages over 10 runs.


Class 2      10 min. offline computation           60 min. offline computation
             MSA   MSA   GLS   GSA   GSA   GSA     GLS   GSA   GSA   GSA
Instance       d     c    df    df   dfr    ro      df    df   dfr    ro
RC 101-1     0.0   0.2   0.4   1.5   1.2   2.2     0.1   0.6   0.9   1.5
RC 101-2     1.0   1.4   1.9   2.3   4.2   3.2     1.7   2.3   3.2   2.1
RC 101-3     0.0   0.0   1.5   3.2   3.2   3.3     2.2   2.6   2.0   2.3
RC 101-4     0.6   0.8   1.2   3.2   4.7   3.9     1.9   3.0   2.6   2.7
RC 101-5     1.6   1.4   1.6   2.5   3.2   1.7     1.2   1.4   1.7   2.1
Avg          0.6   0.8   1.3   2.5   3.3   2.9     1.4   2.0   2.1   2.1
RC 102-1     0.0   0.4   0.0   1.1   1.2   1.0     0.0   0.9   0.7   0.4
RC 102-2     0.6   1.2   0.6   0.6   1.1   1.0     0.7   1.0   0.8   0.8
RC 102-3     2.0   2.0   0.4   1.5   1.1   1.3     1.0   1.1   1.2   1.0
RC 102-4     0.2   0.4   0.9   0.4   0.4   1.1     0.6   0.2   0.5   0.8
RC 102-5     2.6   2.8   2.1   2.3   2.3   2.7     1.5   1.5   1.5   1.3
Avg          1.1   1.4   0.8   1.2   1.2   1.4     0.8   0.9   0.9   0.9
RC 104-1     6.2   3.0   2.2   0.5   0.6   0.3     2.4   0.0   0.1   0.1
RC 104-2     5.4   2.6   2.6   0.3   2.5   2.9     2.4   0.2   0.2   0.6
RC 104-3     2.0   0.8   1.1   0.0   0.0   0.8     0.9   0.0   0.0   0.0
RC 104-4     0.8   0.6   0.1   0.0   0.0   0.0     0.4   0.0   0.0   0.0
RC 104-5     4.2   0.2   1.5   0.1   0.2   0.1     0.7   0.0   0.0   0.1
Avg          3.7   1.4   1.5   0.2   0.7   0.8     1.4   0.0   0.1   0.2
AVG          1.8   1.2   1.2   1.3   1.7   1.7     1.2   1.0   1.0   1.0

Table 2. Detailed results for Class 2 problem instances. Averages over 10 runs.


Class 3      10 min. offline computation           60 min. offline computation
             MSA   MSA   GLS   GSA   GSA   GSA     GLS   GSA   GSA   GSA
Instance       d     c    df    df   dfr    ro      df    df   dfr    ro
RC 101-1     0.6   0.8   0.9   1.9   2.8   2.5     0.9   1.5   2.1   1.8
RC 101-2     2.2   1.4   0.9   1.4   2.7   1.2     1.0   0.5   1.2   1.2
RC 101-3     0.6   0.8   1.0   2.0   2.6   2.7     1.1   2.5   1.6   1.5
RC 101-4     0.6   1.0   1.3   2.1   2.1   1.8     0.9   1.5   1.9   1.4
RC 101-5     0.0   0.8   0.4   1.0   1.3   1.2     0.5   1.2   0.8   0.7
Avg          0.8   1.0   0.9   1.7   2.3   1.9     0.9   1.4   1.5   1.3
RC 102-1     1.6   1.6   1.4   1.6   1.7   1.4     1.3   1.1   1.4   1.0
RC 102-2     0.8   1.8   2.8   0.8   1.9   0.7     2.4   0.3   0.4   0.8
RC 102-3     0.8   0.8   0.6   0.8   0.8   1.4     0.9   0.5   0.4   0.5
RC 102-4     0.8   1.8   1.1   0.9   1.7   1.5     1.0   0.8   1.2   0.4
RC 102-5     1.4   1.6   1.3   0.7   1.4   1.1     1.8   0.7   1.0   1.1
Avg          1.1   1.5   1.4   1.0   1.5   1.2     1.5   0.7   0.9   0.8
RC 104-1     4.8   2.4   3.4   0.3   0.7   0.6     3.4   0.0   0.3   0.3
RC 104-2     1.0   0.2   0.3   0.0   0.0   0.1     0.3   0.0   0.0   0.1
RC 104-3     1.4   0.4   0.3   0.0   0.0   0.0     0.2   0.0   0.0   0.0
RC 104-4     1.6   0.2   0.8   0.1   0.9   0.3     1.1   0.0   0.1   0.0
RC 104-5     2.2   0.6   0.4   0.0   0.0   0.1     0.3   0.1   0.0   0.0
Avg          2.2   0.8   1.0   0.1   0.3   0.2     1.1   0.0   0.1   0.1
AVG          1.4   1.1   1.1   0.9   1.4   1.1     1.1   0.7   0.8   0.7

Table 3. Detailed results for Class 3 problem instances. Averages over 10 runs.


Class 4      10 min. offline computation           60 min. offline computation
             MSA   MSA   GLS   GSA   GSA   GSA     GLS   GSA   GSA   GSA
Instance       d     c    df    df   dfr    ro      df    df   dfr    ro
RC 101-1     0.0   1.0   0.3   1.2   1.3   1.3     0.3   1.5   0.7   1.2
RC 101-2     2.8   3.6   2.3   1.9   2.1   2.6     2.1   2.0   1.8   1.6
RC 101-3     0.0   1.6   0.2   0.7   1.2   1.0     0.5   0.8   0.5   0.2
RC 101-4     1.0   1.4   0.7   2.0   2.9   1.3     1.1   1.0   1.2   1.1
RC 101-5     2.0   2.2   2.1   3.0   3.9   3.8     1.5   2.2   1.1   3.5
Avg          1.2   2.0   1.1   1.8   2.3   2.0     1.1   1.5   1.1   1.5
RC 102-1     0.0   0.4   0.1   0.3   0.2   0.0     0.1   0.1   0.1   0.1
RC 102-2     1.6   1.4   0.7   0.7   1.0   0.9     0.8   1.2   0.1   0.2
RC 102-3     0.2   1.4   0.6   1.7   1.5   2.1     1.1   1.3   1.3   0.5
RC 102-4     0.0   0.0   0.1   0.7   0.6   0.7     0.4   0.2   0.1   0.1
RC 102-5     0.6   0.6   0.5   0.6   1.2   1.6     0.7   1.0   1.0   1.5
Avg          0.5   0.7   0.4   0.8   0.9   1.1     0.6   0.8   0.5   0.5
RC 104-1    15.6   3.2   6.7   1.4   2.0   0.9     5.0   0.7   0.9   0.7
RC 104-2    16.0   3.4   4.6   0.3   0.3   0.4     4.9   0.2   0.3   0.0
RC 104-3    13.8   5.6   5.0   1.4   1.4   1.9     5.5   1.3   0.9   0.5
RC 104-4    15.6   2.4   7.3   0.5   2.0   0.2     6.5   0.4   0.1   0.4
RC 104-5     8.2   2.0   3.6   0.5   0.5   0.4     4.1   1.0   0.2   0.7
Avg         13.8   3.3   5.4   0.8   1.2   0.8     5.2   0.7   0.5   0.5
AVG          5.2   2.0   2.3   1.1   1.5   1.3     2.3   1.0   0.7   0.8

Table 4. Detailed results for Class 4 problem instances. Averages over 10 runs.


Class 6         No offline   10 min. offline   60 min. offline
               computation       computation       computation
               GLS     GSA       GSA     GSA       GSA     GSA
Instance        df      df       dfr      ro       dfr      ro
RC 101-1       6.2     4.8       3.8     4.7       4.1     3.7
RC 101-2       5.7     3.2       2.4     2.8       3.4     2.5
RC 101-3       2.7     2.2       4.2     3.6       2.6     3.3
RC 101-4       6.9     4.9       4.4     3.6       4.0     3.5
RC 101-5       3.9     4.0       4.4     3.8       3.2     3.7
Avg            5.1     3.8       3.8     3.7       3.5     3.3
RC 102-1       3.7     2.1       3.9     4.2       3.1     3.8
RC 102-2       2.3     3.3       1.0     0.7       1.4     0.6
RC 102-3       2.0     0.9       1.9     1.4       0.6     0.9
RC 102-4       2.4     1.8       1.4     1.2       1.4     1.1
RC 102-5       5.4     4.3       3.5     4.2       4.0     2.6
Avg            3.2     2.5       2.3     2.3       2.1     1.8
RC 104-1      17.6    10.2      10.1    10.2      10.8     8.8
RC 104-2      15.2    12.4      13.5    12.0      11.7    11.9
RC 104-3      16.6    12.8      14.5    14.3      13.2    12.7
RC 104-4      16.7    16.5      18.8    17.7      16.2    17.1
RC 104-5      16.3    11.3      14.8    13.8      14.2    15.9
Avg           16.5    12.6      14.3    13.6      13.2    13.3
AVG            8.2     6.3       6.8     6.5       6.3     6.1

Table 5. Detailed results for Class 6 problem instances. Averages over 10 runs.